DO PEOPLE MEAN WHAT THEY SAY?

IMPLICATIONS  FOR  SUBJECTIVE  SURVEY  DATA 

Marianne  Bertrand  and  Sendhil  Mullainathan* 

Many surveys contain a wealth of subjective questions that are at first glance rather exciting.
Examples include "How important is leisure time to you?", "How satisfied are you with
yourself?",  or  "How  satisfied  are  you  with  your  work?"  Yet  despite  easy  availability,  this  is 
one  data  source  that  economists  rarely  use.  In  fact,  the  unwillingness  to  rely  on  such  questions 
marks  an  important  divide  between  economists  and  other  social  scientists. 

This  neglect  does  not  come  from  disinterest.  Most  economists  would  probably  agree  that 
the  variables  these  questions  attempt  to  uncover  are  interesting  and  important.  But  they  doubt 
whether  these  questions  elicit  meaningful  answers.  These  doubts  are,  however,  based  on  a  priori 
skepticism  rather  than  on  evidence.  This  ignores  a  large  body  of  experimental  and  empirical 
work  that  has  investigated  the  meaningfulness  of  answers  to  these  questions.  Our  primary 
objective  in  this  paper  is  to  summarize  this  literature  for  an  audience  of  economists,  thereby 
turning  a  vague  implicit  distrust  into  an  explicit  position  grounded  in  facts.  Having  summarized 
the  findings,  we  integrate  them  into  a  measurement  error  framework  so  as  to  understand  what 
they  imply  for  empirical  research  relying  on  subjective  data.  Finally,  in  order  to  calibrate  the 
extent  of  the  measurement  error  problem,  we  perform  some  simple  empirical  work  using  specific 
subjective  questions. 

I.  EVIDENCE  ON  SUBJECTIVE  QUESTIONS 
Cognitive  Problems 


We  begin  by  summarizing  the  experimental  evidence  on  how  cognitive  factors  affect  the  way 
people  answer  survey  questions.1  A  set  of  experiments  has  shown  that  simple  manipulations 
can affect how people process and interpret questions. A first interesting manipulation is
the ordering of questions: whether question X is preceded by question Y or vice versa can
substantially  affect  answers.  One  reason  for  this  ordering  effect  is  that  people  attempt  to  provide 
answers  consistent  with  the  ones  they  have  already  given  in  the  survey.  A  second  issue  is  that 
prior  questions  may  elicit  certain  memories  or  attitudes,  which  then  influence  later  answers.  In  a 
striking study, respondents were asked two questions: "How happy are you with life in
general?" and "How often do you normally go out on a date?" When the dating question came
first, the answers to both were highly correlated, but when it came second, they were basically
uncorrelated.  Apparently,  the  dating  question  induced  people  to  focus  on  one  aspect  of  their 
life,  an  aspect  that  had  undue  effects  on  their  subsequent  answer. 

Another  cognitive  effect  is  the  importance  of  question  wording.  In  one  classic  example, 
researchers  compared  responses  to  two  questions:  "Do  you  think  the  United  States  should 
forbid  public  speeches  against  democracy?"  and  "Do  you  think  that  the  United  States  should 
allow  public  speeches  against  democracy?"  While  more  than  half  of  the  respondents  stated  that 
yes,  public  speeches  should  be  "forbidden,"  three  quarters  answered  that  no,  public  speeches 
should not be "allowed." Evidence of such wording effects is extremely common.

Cognitive problems also arise due to the scales presented to people. In an experiment,
German respondents were asked how many hours of TV they watched per day. Half of
the respondents were given a scale that began with ≤ ½ hour and then proceeded in half-hour
increments, ending with 4½+ hours. The other respondents were given the same scale except
that the first five answers were compressed, so that it began with ≤ 2½ hours. Only 16% of the
respondents given the first set of response alternatives reported watching more than two and a
half hours of TV per day, compared with 32% of the respondents given the second set.
Respondents thus appear to be inferring "normal" TV viewing from the scale. The first scale,
with a finer partition in the 0 to 2 hours range, suggests to subjects that this amount of TV
viewing is common. In fact, stating that the survey's purpose is to estimate the amount of TV
viewing greatly diminishes the scale effect.

An  even  more  fundamental  problem  is  that  respondents  may  make  little  mental  effort  in 
answering  the  question,  such  as  by  not  attempting  to  recall  all  the  relevant  information  or  by 
not reading through the whole list of alternative responses. As a consequence, the ordering of
the response alternatives matters, since subjects may simply pick the first or last available
alternatives  in  a  list.  In  the  General  Social  Survey,  for  example,  respondents  are  asked  to  list  the 
most  and  least  desirable  qualities  that  a  child  may  have  out  of  a  list  of  13  qualities.  Researchers 
surveyed  people  and  gave  them  this  list  in  either  the  GSS  order  or  in  reverse  order.  They  found 
that  subjects  would  rate  the  first  or  last  listed  qualities,  whatever  they  were,  as  most  important. 
Social  Desirability 

Beyond purely cognitive issues, the social nature of the survey procedure also appears to
play a big role in shaping answers to subjective questions. Respondents want to avoid looking
bad in front of the interviewer. A famous example is that roughly 25% of non-voters report
having voted immediately after an election. This over-reporting is strongest among those who
value norms of political participation the most and those who originally intended to vote.
Other studies have noted that if one adds to a voting question a qualifier such as "Many people do
not vote because something unexpectedly arose...," the discrepancy rate between self-reported
voting and actual voting drops.

Another example can be found in the self-reporting of racial attitudes. Much evidence suggests
that people are unwilling to report prejudice. For example, reported prejudice increases when
respondents believe they are being psychologically monitored for truth telling and decreases
when the survey is administered by a black person.
Non-Attitudes, Wrong Attitudes and Soft Attitudes

Perhaps  the  most  devastating  problem  with  subjective  questions,  however,  is  the  possibility 
that  attitudes  may  not  "exist"  in  a  coherent  form.  A  first  indication  of  such  problems  is  that 
measured  attitudes  are  quite  unstable  over  time.  For  example,  in  two  surveys  spaced  a  few 
months apart, the same subjects were asked about their views on government spending. Amazingly,
55% of the subjects reported different answers. Such low correlations at high frequencies
are  quite  representative. 

Part  of  the  problem  comes  from  respondents'  reluctance  to  admit  lack  of  an  attitude.  Simply 
because  the  surveyor  is  asking  the  question,  respondents  believe  that  they  should  have  an  opinion 
about  it.  For  example,  researchers  have  shown  that  large  minorities  would  respond  to  questions 
about  obscure  or  even  fictitious  issues,  such  as  providing  opinions  on  countries  that  don't  exist. 

A second, more profound, problem is that people may often be wrong about their "attitudes."
People may not really be good at forecasting their behavior or understanding why they did what
they did. In a well-known experiment, subjects are placed in a room where two ropes are hanging
from the ceiling and are asked to tie the two ropes together. The two ropes are sufficiently far
apart that one cannot merely grab one by the hand and then grab the other one. With no
other  information,  few  of  the  subjects  are  able  to  solve  the  problem.  In  a  treatment  group,  the 
experimenter  accidentally  bumps  into  one  of  the  ropes,  setting  it  swinging.  Many  more  people 
solve  the  problem  in  this  case:  subjects  now  see  that  they  can  set  the  ropes  swinging  and  grab 
one  on  an  up  arc.  Yet  when  they  are  debriefed  and  asked  how  they  solved  the  problem,  few  of 
the  subjects  recognize  that  it  was  the  jostling  by  the  experimenter  that  led  them  to  the  solution. 

A  final  and  related  problem  is  cognitive  dissonance.  Subjects  may  report  (and  even  feel) 
attitudes that are consistent with their behavior and past attitudes. In one experiment, individuals
are asked to perform a tedious task and then paid either very little or a lot for it.
When  asked  afterwards  how  they  liked  the  task,  those  who  are  paid  very  little  report  greater 
enjoyment.  They  likely  reason  to  themselves,  "If  I  didn't  enjoy  the  task,  why  would  I  have  done 
it  for  nothing?"  Rather  than  admit  that  they  should  just  have  told  the  experimenter  that  they 
were  leaving,  they  prefer  to  think  that  the  task  was  actually  interesting.  In  this  case,  behavior 
shapes  attitudes  and  not  the  other  way  around. 
II.  A  MEASUREMENT  ERROR  PERSPECTIVE 

What do these findings imply for statistical work using subjective data? Let us adopt a
measurement error perspective and assume that reported attitudes equal true attitudes plus
some error term, $A = A^* + e$. Statistically, we readily understand the case where $e$ is white
noise. The above evidence, however, suggests two important ways in which the measurement
error in attitude questions will be more than white noise. First, the mean of the error term will
not  necessarily  be  zero  within  a  survey.  For  example,  the  fact  that  a  survey  uses  "forbid"  rather 
than  "allow"  in  a  question  will  affect  answers.  Second,  many  of  the  findings  in  the  literature 
suggest  that  the  error  term  will  be  correlated  with  observable  and  unobservable  characteristics 
of  the  individual.  For  example,  the  misreporting  of  voting  is  higher  in  certain  demographic 
groups  (e.g.  those  that  place  more  social  value  on  voting). 
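
One way to keep these departures from white noise apart is to write the error term as a sum of components. The decomposition below is only an illustrative sketch: the symbols $\mu_s$, $\theta$, $\phi$ and $\nu_{it}$ are notation assumed here for exposition and do not appear elsewhere in the paper.

```latex
% Illustrative decomposition of the reporting error (assumed notation):
%   \mu_s    : shift common to everyone answering survey s (e.g. "forbid" vs. "allow")
%   \theta   : loading of the error on observable characteristics X
%   \phi     : loading of the error on unobservable characteristics Z
%   \nu_{it} : mean-zero white noise, independent of X, Z and A^*
\begin{align*}
A_{it} &= A^*_{it} + e_{it}, \\
e_{it} &= \mu_s + \theta X_{it} + \phi Z_{it} + \nu_{it}.
\end{align*}
% The textbook white-noise case corresponds to \mu_s = \theta = \phi = 0.
```

In this notation, the first problem described above is a nonzero $\mu_s$, and the second is a nonzero $\theta$ or $\phi$.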

There are two types of analysis that can be performed with subjective variables: using
attitudes to explain behavior or explaining attitudes themselves. We will examine how mismeasurement
affects both types of analyses. First, suppose we are interested in using self-reported
attitudes to explain behavior. Specifically, suppose we estimate $Y_{it} = a + bX_{it} + cA_{it}$, while
the true model is $Y_{it} = \alpha + \beta X_{it} + \gamma A^*_{it} + \delta Z_{it}$, where $i$ represents individuals, $t$ represents
time, $Y$ represents an outcome of interest, $X$ represents observable characteristics, $Z$ represents
unobservable characteristics, and we assume for simplicity that $Z$ is orthogonal to $X$. How will
the estimated coefficient $c$ compare to $\gamma$ given what we have learned about measurement error
in attitude questions?

White noise in the measurement of A will produce an attenuation bias, i.e. a bias towards
zero. The first measurement problem listed above, a survey fixed effect, will produce no bias
as long as the appropriate controls (such as year or survey specific dummies) are included.
The second problem, correlation with individual characteristics X and Z, will create a bias:
c will now include both the true effect of attitude and the fact that the measurement error
in A is correlated with unobservables. Hence, assuming that measurement error problems are
not dominant, subjective variables can be useful as control variables, but care must be taken in
interpreting their coefficients. The estimated coefficient captures not only the effect of attitude
but also the effect of other variables that influence how the attitude is self-reported. This is closely related
to the causality problem that we often encounter even with perfectly measured variables.2
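
To make these cases concrete, the following minimal simulation sketch (Python, standard library only) draws data from the true model and regresses Y on X and the reported attitude A. Everything specific in it is an illustrative assumption rather than something taken from the data used in this paper: the parameter values, the unit-variance normal draws, the choice to draw the true attitude independently of X and Z, the choice e = Z for the correlated case, and the helper names `solve`, `ols` and `estimated_c`.

```python
import random


def solve(m, v):
    """Solve the linear system m x = v by Gauss-Jordan elimination."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def estimated_c(error_kind, n=50_000, seed=0):
    """Coefficient on reported attitude A when Y is regressed on (1, X, A)."""
    rng = random.Random(seed)
    alpha, beta, gamma, delta = 0.5, 1.0, 1.0, 2.0  # assumed, illustrative values
    rows, y = [], []
    for _ in range(n):
        x, a_star, z, u = (rng.gauss(0, 1) for _ in range(4))
        if error_kind == "none":
            e = 0.0                # attitude reported without error
        elif error_kind == "white":
            e = rng.gauss(0, 1)    # white noise, same variance as A*
        else:
            e = z                  # error driven by the unobservable Z
        rows.append([1.0, x, a_star + e])
        y.append(alpha + beta * x + gamma * a_star + delta * z + u)
    return ols(rows, y)[2]


for kind in ("none", "white", "correlated"):
    print(kind, round(estimated_c(kind), 3))
```

Under these assumptions, the coefficient on A should be close to gamma with no error, close to gamma/2 with white noise of the same variance as the true attitude, and close to (gamma + delta)/2 when e = Z. Correlated error can therefore move c away from gamma in either direction, depending on the sign of delta.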

Let  us  now  turn  to  the  second  type  of  analysis,  where  we  are  attempting  to  explain  attitudes 
themselves.  For  example,  we  might  ask  whether  high  work  hours  increase  loneliness.  Specifically, 
suppose we estimate $A_{it} = a + bX_{it} + e$, while the true model is $A^*_{it} = \alpha + \beta X_{it} + \gamma Z_{it}$.

In  this  setup,  the  white  noise  in  the  measurement  of  attitudes  no  longer  causes  bias.  But  the 
other  biases  now  play  a  much  more  important  role.  Specifically,  the  fact  that  measurement  error 
is correlated with individual characteristics will now severely bias the estimated coefficient on X. For example, suppose we
see  that  those  from  rich  backgrounds  have  a  greater  preference  for  money.  As  noted  earlier,  this 
might  simply  reflect  the  fact  that  a  rich  background  affects  the  reporting  of  the  preference  for 
money.  Such  a  correlation  could  thus  be  purely  spurious.  Notice  that  this  problem  is  far  more 
severe  than  in  the  previous  analysis.  First,  the  fact  that  an  X  helps  predict  "attitude"  means 
very  little  if  it  is  only  predicting  the  measurement  error  in  attitude.  So,  one  cannot  argue  as  one 
did  before  that  simply  helping  to  predict  is  a  good  thing,  irrespective  of  causality.  Second,  this 
is  a  problem  that  is  much  harder  to  solve  than  an  omitted  variable  bias  problem.  For  example, 
it  is  hard  to  see  how  an  instrumental  variable  could  resolve  this  issue.  One  would  need  an 
instrument  that  affects  X  but  not  the  measurement  of  attitude.  But  the  above  evidence  tells  us 
that X will likely affect measurement in a causal sense. This makes it very unlikely that such
an instrument could be found in most contexts.
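
The spurious-correlation concern can be illustrated with a small simulation sketch (Python, standard library only). It assumes, purely for illustration, that X has no true effect on the attitude (beta = 0) and that the reporting error equals theta times X plus white noise. The parameter values and the names `slope` and `estimated_b` are hypothetical and not part of the paper.

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def estimated_b(theta, n=50_000, seed=1):
    """Slope of reported attitude on X when the error loads on X with weight theta."""
    rng = random.Random(seed)
    alpha, beta, gamma = 0.0, 0.0, 1.0  # beta = 0: X has no true effect (assumed)
    xs, reported = [], []
    for _ in range(n):
        x, z, noise = (rng.gauss(0, 1) for _ in range(3))
        a_star = alpha + beta * x + gamma * z   # true attitude
        e = theta * x + noise                   # reporting error
        xs.append(x)
        reported.append(a_star + e)
    return slope(xs, reported)


print("white noise only:", round(estimated_b(theta=0.0), 3))
print("error depends on X:", round(estimated_b(theta=0.5), 3))
```

In this setup the estimated slope should be near beta + theta: close to zero when the error is pure white noise, and close to theta when X shifts reporting, even though X has no effect on the true attitude.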

To  summarize,  interpreting  the  experimental  evidence  in  a  measurement  error  framework 
provides  two  important  insights.  First,  if  the  measurement  error  is  small  enough,  subjective 
measures  may  be  helpful  as  independent  variables  in  predicting  outcomes,  with  the  caveat  that 
the  coefficients  must  be  interpreted  with  care.  Second,  subjective  variables  cannot  reasonably  be 
used as dependent variables, given that the explanatory variables are likely to affect the
measurement error in a causal way.
III.  HOW  MUCH  NOISE  IS  THERE? 

This  leaves  the  important  quantitative  question:  how  much  white  noise  error  is  there  in 
the  subjective  questions  we  might  be  interested  in?  Can  we  in  fact  gain  anything  by  adding 
responses  to  subjective  questions  to  our  econometric  models? 

To assess this, we turn to the High School and Beyond Senior sample, which surveyed seniors
in  school  in  1980  and  then  followed  them  every  two  years  until  1986.  This  sample  provides  us 
with  a  set  of  subjective  and  objective  variables  in  each  of  these  waves. 

In the first 8 columns of Table 1, we correlate answers to a set of attitude variables with
future income (thereby removing mechanical correlations with current income). Each cell in the
table corresponds to a separate regression. The dependent variable is log(salary) in 1985. In
row 1, we add as controls the sex, race and educational attainment of the respondent. Answers
to the subjective questions clearly help predict individual income. Several of the correlations are
very intuitive. People who value money or a steady job more earn more. People who value
social goals such as correcting inequalities around them earn less. People who have a positive
attitude towards themselves earn more. Somewhat intriguingly, we find that people who
care about their family earn substantially more. Even more intriguingly, people who value leisure
time also earn more. The second row shows that respondents' attitudes do not simply proxy
for objective family background characteristics. Controlling for parents' education and family
income in the senior year does not weaken the predictive power of the attitude variables. In row
3, we show that attitude questions stay predictive of future income even after one controls for
current individual income.

As a whole, these results suggest that noise does not dominate the measurement of these subjective
questions. Attitudes actually predict income even beyond past income and background
characteristics.  Of  course,  we  are  not  arguing  for  causality,  merely  that  attitude  variables  add 
explanatory  power. 

Finally,  one  might  wonder  to  what  extent  these  variables  are  conveying  any  information 
beyond  fixed  individual  characteristics.  In  row  4,  we  exploit  the  panel  nature  of  the  High 
School  and  Beyond  survey.  We  rerun  the  standard  regressions  with  lagged  attitude  measures 
but  also  add  person  fixed  effects.  Most  of  the  effects  previously  discussed  disappear,  except  for 
the  importance  of  work  and  the  importance  of  having  a  steady  job  (marginally  significant).  It 
therefore  does  not  appear  that  changes  in  attitudes  have  as  much  predictive  power  as  attitudes 
themselves.  Thus,  while  these  attitude  questions  are  helpful  in  explaining  fixed  differences 
between  individuals,  changes  in  reported  attitudes  are  not  helpful  in  explaining  changes  in 
outcomes. 
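
The contrast between pooled and fixed-effects estimates can be reproduced in a stylized simulation (Python, standard library only). This is a sketch of one data-generating process that yields such a pattern, not a description of the High School and Beyond data: it assumes that both the reported attitude and the outcome equal a fixed individual trait plus independent wave-specific noise, and it uses contemporaneous rather than lagged attitudes for simplicity. The names `demean_by_person` and `pooled_and_within` are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_person(panel):
    """Within transformation: subtract each person's mean across waves."""
    out = []
    for waves in panel:
        m = sum(waves) / len(waves)
        out.extend(v - m for v in waves)
    return out


def pooled_and_within(n_people=6000, n_waves=3, seed=2):
    """Return (pooled slope, person-fixed-effects slope) of outcome on attitude."""
    rng = random.Random(seed)
    attitudes, outcomes = [], []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)  # fixed individual characteristic (assumed)
        attitudes.append([trait + rng.gauss(0, 1) for _ in range(n_waves)])
        outcomes.append([trait + rng.gauss(0, 1) for _ in range(n_waves)])
    pooled = slope([v for w in attitudes for v in w],
                   [v for w in outcomes for v in w])
    within = slope(demean_by_person(attitudes), demean_by_person(outcomes))
    return pooled, within


pooled, within = pooled_and_within()
print("pooled:", round(pooled, 3), "within:", round(within, 3))
```

With these assumptions the pooled slope should be clearly positive (about one half), while the within-person slope should be near zero, because demeaning removes the fixed trait and leaves only unrelated noise.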

In column 9, we investigate whether answers to reservation wage questions are correlated
with future income. Are individuals who report a higher reservation wage today likely to earn
more in the future? We see a very strong relationship between reservation wage and future
income, even after controlling for the individual's education, sex and race. This holds true even
if we add controls for family background (row 2) or family background and current income (row
3). However, changes in reported reservation wages do not help predict changes in income (row
4). In summary, answers to reservation wage questions do appear to capture some unobserved
individual characteristics and might be worth including when trying to predict individual income.
Changes in reported reservation wages, however, provide no information about changes in income.

Finally,  in  column  10,  we  ask  whether  answers  to  job  satisfaction  questions  help  predict 
future job turnover. Again, we find that people's self-reported satisfaction with their job "as a
whole" is a strong predictor of whether they change jobs in the future.3
IV.  CONCLUSION 

Four  main  messages  emerge  from  this  discussion.  First,  a  large  experimental  literature  by 
and  large  supports  economists'  skepticism  of  subjective  questions.  Second,  put  in  an  econometric 
framework,  these  findings  cast  serious  doubts  on  attempts  to  use  subjective  data  as  dependent 
variables  because  the  measurement  error  appears  to  correlate  with  a  large  set  of  characteristics 
and  behaviors.  For  example,  a  drop  in  reported  racism  over  time  may  simply  reflect  an  increased 
reluctance to report racism. Since many of the interesting applications would likely use these
data  as  dependent  variables,  this  is  a  rather  pessimistic  conclusion.  Third,  and  on  a  brighter 
note,  these  data  may  be  useful  as  explanatory  variables.  One  must,  however,  take  care  in 
interpreting  the  results  since  the  findings  may  not  be  causal.  Finally,  our  empirical  work  suggests 
that subjective variables are in practice useful for explaining differences in behavior across
individuals. Changes in answers to these questions, however, do not appear useful in explaining
changes in behavior.

TABLE 1: Effect of Attitude Questions on Future Outcomes

[Table body not recoverable from the source. Surviving header fragments: the dependent variable is Log Wage in columns 1 to 9 and Stayer in column 10; the attitude column labels begin "Value" (columns 1 to 7), "Positive" (column 8), "Reservation" (column 9) and "Satisfied" (column 10).]

Notes:

1.  Data  Source:  High  School  and  Beyond  (Seniors).  Demographic  characteristics  include  education,  sex  and  race.  Family  background  characteristics 
include  father  education,  mother  education  and  family  income  in  senior  year  (7  categories).  "Stayer"  is  a  dummy  variable  which  equals  one  if  there 
is no job change between the second and third follow-ups. "Work"

2.  Each  cell  corresponds  to  a  separate  regression.   Standard  errors  are  in  parentheses. 

3.  Except  in  row  4,  outcomes  are  from  the  third  follow-up  survey  and  attitudes  are  from  the  second  follow-up.  Row  4  reports  panel  regressions  on  all 
available  survey  periods.   The  regressions  in  row  4  include  survey  fixed  effects. 

Footnotes

*  Graduate  School  of  Business,  University  of  Chicago,  NBER  and  CEPR;  MIT  and  NBER. 

1.  Due  to  space  constraints,  we  will  just  mention  two  books  that  are  a  good  source  for  a 
review  of  the  experimental  evidence:  Tanur  (1992)  and  Sudman,  Bradburn  and  Schwarz 
(1996). A fuller list of references can be found in the full version of this paper (Bertrand
and  Mullainathan  2000). 

2.  An  extreme  example  of  this  occurs  when  the  measurement  error  is  correlated  with  the 
variable  of  interest  itself  as  is  suggested  by  cognitive  dissonance.  For  example,  people  may 
report  a  lower  preference  for  money  if  they  are  making  less  money.  This  is  a  case  of  pure 
reverse  causation. 

3.  In  this  case,  we  are  not  able  to  study  a  fixed  effect  model  as  the  job  satisfaction  question 
was  only  asked  in  the  second  and  third  follow  up  of  the  data. 